5 research outputs found

    A Bi-Directional GRU Architecture for the Self-Attention Mechanism: An Adaptable, Multi-Layered Approach with Blend of Word Embedding

    Sentiment analysis (SA) has become an essential component of natural language processing (NLP), with numerous practical applications for understanding “what other people think”. Various techniques have been developed to tackle SA using deep learning (DL); however, current research lacks comprehensive strategies that incorporate multiple word embeddings. This study proposes a self-attention mechanism that leverages DL and involves the contextual integration of word embeddings with a time-distributed bidirectional gated recurrent unit (Bi-GRU). This work employs the word embedding approaches GloVe, word2vec, and fastText to achieve better predictive capability. By integrating these techniques, the study aims to improve the classifier’s ability to precisely analyze and categorize sentiments in textual data from the movie domain. The investigation seeks to enhance the classifier’s performance in NLP tasks by addressing the challenges of underfitting and overfitting in DL. To evaluate the model’s effectiveness, the openly available IMDb dataset was utilized, achieving a remarkable testing accuracy of 99.70%.
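The architecture this abstract describes (blended embeddings → bidirectional GRU → self-attention pooling) can be sketched in a heavily simplified form with plain NumPy. The dimensions, random weights, and additive attention scoring below are illustrative assumptions, not the paper's exact configuration; in the paper the embeddings would come from GloVe, word2vec, and fastText rather than random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU cell update (biases omitted for brevity)."""
    z = 1 / (1 + np.exp(-(x @ Wz + h @ Uz)))   # update gate
    r = 1 / (1 + np.exp(-(x @ Wr + h @ Ur)))   # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

def run_gru(X, d):
    """Run a randomly initialized GRU over a (T, e) sequence; return (T, d) states."""
    e = X.shape[1]
    Wz, Wr, Wh = (rng.normal(0, 0.1, (e, d)) for _ in range(3))
    Uz, Ur, Uh = (rng.normal(0, 0.1, (d, d)) for _ in range(3))
    h = np.zeros(d)
    out = []
    for x in X:
        h = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
        out.append(h)
    return np.array(out)

# Toy stand-in for the blended word embeddings: 7 tokens, 12 dims.
T, emb_dim, hid = 7, 12, 8
X = rng.normal(size=(T, emb_dim))

# Bidirectional GRU: a forward pass and a backward pass, concatenated.
H_fwd = run_gru(X, hid)
H_bwd = run_gru(X[::-1], hid)[::-1]
H = np.concatenate([H_fwd, H_bwd], axis=1)      # (T, 2*hid)

# Additive self-attention pooling over the time axis.
w = rng.normal(size=2 * hid)
scores = np.tanh(H) @ w
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                            # attention weights sum to 1
sentence_vec = alpha @ H                        # weighted sum -> (2*hid,)
```

The resulting `sentence_vec` is the kind of fixed-length representation a downstream dense classifier would consume.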

    A Novel Paradigm for Sentiment Analysis on COVID-19 Tweets with Transfer Learning Based Fine-Tuned BERT

    The rapid escalation in global COVID-19 cases has engendered profound emotions of fear, agitation, and despondency within society. This is evident from COVID-19-related tweets that spark panic and elevate stress among individuals. Analyzing the sentiment expressed in online comments helps various stakeholders monitor the situation. This research aims to improve the performance of pre-trained bidirectional encoder representations from transformers (BERT) by employing transfer learning (TL) and fine hyper-parameter tuning (FT). The model is applied to three distinct COVID-19-related datasets, each belonging to a different classification scheme. The model’s performance is evaluated against six different machine learning (ML) classification models, using metrics such as accuracy, precision, recall, and F1-score. Heat maps are generated for each model to visualize the results. The model achieves accuracies of 83%, 97%, and 98% for Class-5, Class-3, and binary classification, respectively.
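The transfer-learning pattern this abstract relies on (reuse a pre-trained encoder, train a new classification head, then gently fine-tune the encoder) can be illustrated with a deliberately tiny NumPy stand-in. This is not the authors' BERT pipeline: the "pretrained" encoder here is just a fixed nonlinear projection, and all data, dimensions, and learning rates are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a pretrained encoder (BERT in the paper):
# a fixed projection whose weights were "learned" elsewhere.
W_pre = rng.normal(size=(20, 5))

def train_head(F, y, steps=300, lr=0.5):
    """Logistic-regression head trained on frozen encoder features."""
    w = np.zeros(F.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(F @ w)))
        w -= lr * F.T @ (p - y) / len(y)
    return w

# Synthetic binary "tweet sentiment" data.
X = rng.normal(size=(200, 20))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)

# Phase 1 (transfer learning): encoder frozen, only the head trains.
F = np.tanh(X @ W_pre)
w_head = train_head(F, y)

# Phase 2 (fine-tuning): unfreeze the encoder and nudge its weights
# with a much smaller learning rate, keeping the trained head.
W_ft = W_pre.copy()
for _ in range(50):
    H = np.tanh(X @ W_ft)
    p = 1 / (1 + np.exp(-(H @ w_head)))
    grad_H = np.outer(p - y, w_head) * (1 - H**2)  # backprop through tanh
    W_ft -= 0.01 * X.T @ grad_H / len(y)

acc = ((1 / (1 + np.exp(-(np.tanh(X @ W_ft) @ w_head))) > 0.5) == y).mean()
```

The small fine-tuning learning rate mirrors standard practice with BERT, where large updates would destroy the pre-trained representations.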

    Influence of Pre-Processing Strategies on the Performance of ML Classifiers Exploiting TF-IDF and BOW Features

    Data analytics and its associated applications have recently become important fields of study. A subject of concern for researchers nowadays is the massive amount of data produced every minute as people constantly share thoughts and opinions about things that matter to them. Social media data, however, remain unstructured, dispersed, and hard to handle, and need a strong foundation before they can be utilized as valuable information on a particular topic. Processing such unstructured data, with its noise, co-references, emoticons, folksonomies, and slang, is quite challenging and therefore requires proper data pre-processing before the right sentiments can be extracted. The dataset is extracted from Kaggle and Twitter, pre-processing is performed using NLTK and Scikit-learn, and feature selection and extraction are done for the Bag of Words (BOW) and Term Frequency–Inverse Document Frequency (TF-IDF) schemes.

    For polarity identification, we evaluated five different Machine Learning (ML) algorithms, viz. Multinomial Naive Bayes (MNB), Logistic Regression (LR), Decision Trees (DT), XGBoost (XGB), and Support Vector Machines (SVM). We performed a comparative analysis of these algorithms to decide which works best for the given dataset in terms of recall, accuracy, F1-score, and precision. We assess the effects of various pre-processing techniques on two datasets, one domain-specific and one not. The SVM classifier outperformed the other classifiers, with superior evaluations of 73.12% accuracy and 94.91% precision. This research also highlights that the selection and representation of features, along with various pre-processing techniques, have a positive impact on classification performance. The ultimate outcome indicates an improvement in sentiment classification, and we note that pre-processing approaches clearly improve the efficiency of the classifiers.
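The feature-scheme-versus-classifier comparison this abstract describes maps directly onto scikit-learn pipelines. The sketch below is a minimal illustration, not the study's setup: the corpus is an invented toy example, and only three of the five classifiers (MNB, LR, SVM) are shown; DT and XGBoost would slot into the same loop.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus; the study used Kaggle and Twitter data.
texts = [
    "loved this film great acting", "what a wonderful experience",
    "best movie of the year", "truly enjoyable and fun",
    "terrible plot waste of time", "worst film ever so boring",
    "awful acting very bad", "dull and disappointing movie",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Cross every feature scheme (BOW, TF-IDF) with every classifier,
# recording training accuracy for each combination.
results = {}
for feat_name, vec in [("BOW", CountVectorizer()),
                       ("TF-IDF", TfidfVectorizer())]:
    for clf_name, clf in [("MNB", MultinomialNB()),
                          ("LR", LogisticRegression(max_iter=1000)),
                          ("SVM", LinearSVC())]:
        pipe = make_pipeline(vec, clf).fit(texts, labels)
        results[(feat_name, clf_name)] = pipe.score(texts, labels)
```

In the actual study the scoring would use a held-out split and report recall, F1-score, and precision alongside accuracy, and the pre-processing (NLTK tokenization, stop-word removal, etc.) would run before vectorization.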
